Low Hanging Fruits

to Increase the Reproducibility of Your Research




Jürgen Schneider

12 February 2026

Why reproducibility matters

RESEARCHER A:
“I baked this cake…

  • with these ingredients
  • and this recipe”

YOU:
“I want that too! So I’ll

  • use the same ingredients
  • and use the same recipe”

Why reproducibility matters

What is Reproducibility?



                     Same Data      Different Data
Same Analysis        Reproducible   Replicable
Different Analysis   Robust         Generalizable

(NAS, 2018, p. 46)

The cumulative nature of science fundamentally depends on researchers building upon others’ findings (Merton, 1973).


“In principle, all reported evidence should be reproducible” (Nosek et al., 2022, p. 721)

Why reproducibility matters

“Isn’t that a given?”

Artner et al. (2021) attempted to reproduce 232 scientific claims from 46 journal articles; only 70% could be reproduced, even with the original data at hand.

Why reproducibility matters

“Isn’t that a given?” - Why not?

  • Crüwell et al. (2023): all articles from one issue of Psychological Science
  • Hardwicke et al. (2021): articles with an open data badge (Psychological Science, 2014–2015)
  • Obels et al. (2020): 36 registered reports that shared both code and data

Common obstacles:

  • Renaming files
  • Hard-coding file paths
  • Copy-paste errors
  • Wrong rounding
  • Old package versions
  • Non-standardized computational environments (e.g., older software versions)

(Batinovic & Carlsson, 2023)

Why reproducibility matters

High cost, if not reproducible


(Artner et al., 2021, p. 12)

Why reproducibility matters

Everyday situations in your research process:

  • Joint analyses with co-authors
  • Reviewers check your analyses
  • Further analyses after review
  • Recalculation for meta-analysis


Requirements for a reproducible workflow, and practices that meet them:

  • System-independent executable → compute environment control
  • Executable free of charge → cost-free software
  • Comprehensible for yourself and others → literate programming
  • Executable over the long term → compute environment control

Reproducible Reporting

Basic requirement

I assume that’s a given

  • Share data and analysis code (see the closed_data() function from WORCS, or the synthpop package, in case you cannot share the data)
  • Set up your work as a ‘project’, where all related files (e.g., data, scripts, results) are stored together in a single folder (as with R-projects) or in a single file (as with JASP and jamovi). Avoid working with isolated files, and use relative paths to connect files within the project.
  • Use a clear folder structure and readme files

(Peng, 2011)

The Workflow for Open Reproducible Code in Science (WORCS) is an excellent framework that also integrates the recommendations given here (Van Lissa et al., 2021).
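The advice to use relative paths within a project can be sketched in Python; the file layout below is hypothetical:

```python
# Build paths relative to the project root instead of hard-coding
# absolute paths like "C:/Users/me/project/data/raw.csv".
from pathlib import Path

project_root = Path(__file__).resolve().parent  # the script sits in the project folder
data_file = project_root / "data" / "raw.csv"   # hypothetical project layout

# The same script now runs unchanged on any collaborator's machine.
print(data_file.name)
```

Because every path is derived from the project root, moving or sharing the whole project folder never breaks the script.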

Beyond Code and Data

Remaining challenges:

  • Which code file on which data in which order?
  • Software version differences
  • Operating system dependencies
  • Package version conflicts

Educational research: Often multi-step procedures with various software packages
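The “which code file on which data in which order?” problem is often solved with a single entry-point script that documents the pipeline; a minimal sketch, with hypothetical step file names:

```python
# One runner script records the exact order of analysis steps,
# so nobody has to guess which file runs first.
import subprocess
import sys

STEPS = ["01_clean_data.py", "02_fit_models.py", "03_make_figures.py"]

def run_pipeline(dry_run=True):
    """Run all steps in order; with dry_run=True, only list them."""
    executed = []
    for step in STEPS:
        if not dry_run:
            subprocess.run([sys.executable, step], check=True)
        executed.append(step)
    return executed

print(run_pipeline())  # lists the steps without executing them
```

Numbered file names plus one runner make the execution order explicit for multi-step procedures.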

The Reproducibility Spectrum

Not reproducible ←―――――――――――――→ Gold standard
  1. Publication only
  2. Publication + Code
  3. Publication + Code and Data
  4. Linked, executable code and data
  5. Isolated computational environment (Docker, Binder, Quarto-live)

Literate Programming: The Key

Concept by Knuth (1984):

  • Interweaving natural language and code
  • Human-readable documentation
  • Code, output, and narrative in one document

Modern implementations:

  • Quarto
  • R Markdown
  • Jupyter Notebooks

Example: Quarto Document

Left side: Markdown + Code blocks
Right side: Rendered document

  • Direct traceability from text to computation
  • Every coefficient traceable to its calculation
  • Automatic documentation of analytical decisions
  • Output in .docx, .pdf, .html, and more
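A minimal Quarto document along these lines might look as follows (the data and numbers are purely illustrative):

````markdown
---
title: "Example analysis"
format: html
---

```{r}
scores <- c(23, 31, 27)  # hypothetical data
m <- mean(scores)
```

The mean score was `r round(m, 1)`, computed directly from the data above.
````

Rendering this file re-runs the code, so the reported mean can never drift out of sync with the data.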

Benefits of Reproducible Reporting

For science:

  • Error detection by reviewers and readers
  • Learning from methodological details
  • Building on previous work

For you:

  • Better organized workflow
  • Easier to revise analyses
  • Documented decisions for future reference
  • Facilitates collaboration

Computational Reproducibility

Problem: Different software versions, operating systems

Solutions:

  1. Basic: Provide code + data + session info
  2. Better: Use package management (renv, conda)
  3. Best: Containerization (Docker) or web-based execution (Quarto-live)
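For a Python-based analysis, the “basic” level (session info) can be as simple as recording interpreter and package versions next to code and data; a sketch, analogous to R’s sessionInfo(), with an assumed package list:

```python
# Minimal "session info" for a Python analysis: record the interpreter
# version and the exact versions of the packages used.
import sys
from importlib import metadata

def session_info(packages):
    lines = [f"Python {sys.version.split()[0]}"]
    for pkg in packages:
        try:
            lines.append(f"{pkg}=={metadata.version(pkg)}")
        except metadata.PackageNotFoundError:
            lines.append(f"{pkg}: not installed")
    return "\n".join(lines)

print(session_info(["pip"]))
```

Saving this output with the analysis lets others spot version differences before they chase phantom discrepancies; renv or conda then automate the “better” level.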

Quarto-Live: Full Reproducibility

A minimal change to the YAML header:

format: live-html
engine: knitr
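In context, the document header might look like this (assuming the quarto-live extension has been added to the project, e.g. via `quarto add r-wasm/quarto-live`):

```yaml
---
format: live-html   # render an interactive, in-browser version
engine: knitr       # execute the R chunks
---
```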

Result:

  • Completely isolated computational environment
  • Accessible via browser
  • Executable via WebAssembly
  • No software installation needed
  • No version conflicts possible

FAIR Data Management

Reproducibility Requires Access

Reproducibility Enables Cumulative Science

Science is fundamentally cumulative (Merton, 1973):

  1. Verification: Can we reproduce the finding?
  2. Extension: Can we build on this work?
  3. Integration: Can we combine multiple studies?

Each step requires:

  • Access to materials
  • Understanding of methods
  • Ability to reuse data and code

The Cost of Non-Reproducibility

Artner et al. (2021) reproduction study:

  • Attempted to reproduce 232 statistical claims from 46 articles
  • Investment: 280 person-days of work
  • Success rate: Only 70%

Their conclusion: “Vagueness makes assessing reproducibility a nightmare”

The problem: Inadequate documentation and data management

FAIR as Systematic Solution

FAIR Principles provide structure for:

  • Making research products discoverable
  • Ensuring long-term accessibility
  • Enabling technical compatibility
  • Supporting informed reuse

Not just “making data available” but making it systematically reusable

From Reproducible to Reusable

Reproducible reporting solves one problem:

  • “Can I recreate your results with your data and code?”

But another problem remains:

  • “Can I actually find and access your data and code?”
  • “Can I understand what your data contains?”
  • “Can I legally and practically reuse your materials?”

→ FAIR principles address these questions

Why FAIR Matters for Reproducibility

Real examples of reproducibility barriers:

  • Stimulus materials posted but copyright unclear → Cannot reuse
  • Data available but no codebook → Cannot understand variables
  • Referenced data links broken → Cannot access
  • Authors left institution → Cannot locate materials

Availability alone ≠ Reproducibility

FAIR Principles

  • Findable
  • Accessible
  • Interoperable
  • Reusable

Framework for systematic practices that enable reuse

FAIR ≠ Open

  • FAIR does not necessarily mean “freely accessible”
  • Particularly relevant for vulnerable populations
  • Metadata remain publicly accessible
  • Access path transparently documented

Principle: “As open as possible, as closed as necessary”

Empirical Evidence for FAIR

Health research (Martínez-García et al., 2023):

  • 56.6% time savings in research data management
  • Monthly savings: €16,800

Reproducibility study (Artner et al., 2021):

  • 232 statistical claims from 46 articles
  • 280 person-days of work
  • Only 70% successfully reproduced

F - Findable

Components:

  1. Persistent, unique identifiers (DOIs)
  2. Rich metadata
  3. Indexing in searchable databases

Repositories:

  • Zenodo.org
  • OSF.io
  • Research data centers (e.g., Verbund FDB)

A - Accessible

Gradations:

  • Freely available research products
  • Controlled access (private repositories)
  • Regulated access via research data centers

Important: Transparent documentation of access path

I - Interoperable

Use of standardized, open formats:

  • Data: CSV with codebook, labeled .RData, ODF
  • Code: R, Python (rather than Mplus or Stata)
  • Avoiding proprietary software
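The “CSV with codebook” recommendation can be sketched as follows; the variable names and descriptions are hypothetical:

```python
# Save data as an open CSV file plus a small machine-readable codebook,
# instead of a proprietary binary format.
import csv

data = [{"id": 1, "score": 23}, {"id": 2, "score": 31}]
codebook = [
    {"variable": "id", "description": "participant identifier", "type": "integer"},
    {"variable": "score", "description": "test sum score (0-40)", "type": "integer"},
]

def write_csv(path, rows, fieldnames):
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

write_csv("data.csv", data, ["id", "score"])
write_csv("codebook.csv", codebook, ["variable", "description", "type"])
```

Both files open in any text editor or statistics package, and the codebook travels with the data rather than living in someone’s head.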

Alternative open-source software:

  • Jamovi, JASP (instead of SPSS)

R - Reusable

Comprehensive documentation:

  • Codebooks for each variable
  • README files
  • Terms of use and licensing information (CC0, CC-BY, CC-BY-SA)
  • Research context and framework
  • Data quality issues

Standards: PSYCH-DS for psychological data

FAIR in Educational Research

Good examples:

  • National Educational Panel Study (NEPS)
  • PISA
  • PIRLS

Challenge: Scalable approaches for smaller projects with limited resources

Frequently Asked Questions

“Reproducibility sounds time-consuming”

Answer:

Initial investment: Yes, there’s a learning curve

Long-term benefits:

  • Faster revisions (code is already there)
  • Easier collaboration (others can understand your work)
  • Better organized workflow
  • More citations and trust

Think incremental: Start small, improve gradually

“My code is messy and embarrassing”

Answer:

The paradox of open code:

  1. Knowing others will see it → You write better code
  2. Better code → Less error-prone, better documented
  3. Better documentation → Others can learn from it

Perfect is the enemy of good: Imperfect but documented code > No code at all

“What about proprietary/sensitive data?”

Answer:

FAIR ≠ Open:

  • Metadata can be public even if data isn’t
  • Controlled access is still FAIR
  • Document access procedures transparently

For educational research:

  • “As open as possible, as closed as necessary”
  • Research data centers provide secure solutions
  • Synthetic data for methods illustration

“I’ll be scooped!”

Answer:

Your advantages persist:

  • Deep knowledge of your data and design
  • First-mover advantage in publication
  • Invitations to collaborate on reuse

Reframing:

  • Reuse = validation of your work
  • Citations from secondary analyses
  • Broader impact of your research

Science as common good: Collective progress benefits everyone

Conclusion

Reproducibility is not optional—it’s core to science:

  • Verification of findings
  • Error detection and correction
  • Cumulative knowledge building

Two practical approaches:

  1. Reproducible Reporting → Document your complete workflow
  2. FAIR Data Management → Enable systematic reuse

Result: Stronger, more credible, more impactful research

Take-Home Message

Reproducibility means:

  • Publishing isn’t complete without data and code
  • Documentation is as important as analysis
  • FAIR principles make reuse systematic, not accidental

Start today:

  • Try literate programming for your next analysis
  • Deposit your next dataset with a DOI
  • Document one more step than you did last time

Perfect is the enemy of good—begin somewhere! 💪

Thank You for Your Attention!

Questions?

Key References

  • Artner et al. (2021). The reproducibility of statistical results in psychological research. Psychological Methods.
  • Peng (2011). Reproducible research in computational science. Science.
  • Wilkinson et al. (2016). The FAIR Guiding Principles. Scientific Data.
  • Hardwicke et al. (2018, 2021). Data availability and analytic reproducibility studies. Royal Society Open Science.

Complete reference list available in the paper.

References

Artner, R., Verliefde, T., Steegen, S., Gomes, S., Traets, F., Tuerlinckx, F., & Vanpaemel, W. (2021). The reproducibility of statistical results in psychological research: An investigation using unpublished raw data. Psychological Methods, 26(5), 527–546. https://doi.org/10.1037/met0000365
Batinovic, L., & Carlsson, R. (2023, March). Why your code doesn’t reproduce: Lessons learned from Meta-Psychology. Unconference 2023: Open Scholarship Practices in Education Research.
Crüwell, S., Apthorp, D., Baker, B. J., Colling, L., Elson, M., Geiger, S. J., Lobentanzer, S., Monéger, J., Patterson, A., Schwarzkopf, D. S., Zaneva, M., & Brown, N. J. L. (2023). What’s in a Badge? A Computational Reproducibility Investigation of the Open Data Badge Policy in One Issue of Psychological Science. Psychological Science, 34(4), 512–522. https://doi.org/10.1177/09567976221140828
Hardwicke, T. E., Bohn, M., MacDonald, K., Hembacher, E., Nuijten, M. B., Peloquin, B. N., deMayo, B. E., Long, B., Yoon, E. J., & Frank, M. C. (2021). Analytic reproducibility in articles receiving open data badges at the journal Psychological Science : An observational study. Royal Society Open Science, 8(1), 201494. https://doi.org/10.1098/rsos.201494
NAS. (2018). Open Science by Design: Realizing a Vision for 21st Century Research. National Academies Press. https://doi.org/10.17226/25116
Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A., Corker, K. S., Dreber, A., Fidler, F., Hilgard, J., Kline Struhl, M., Nuijten, M. B., Rohrer, J. M., Romero, F., Scheel, A. M., Scherer, L. D., Schönbrodt, F. D., & Vazire, S. (2022). Replicability, Robustness, and Reproducibility in Psychological Science. Annual Review of Psychology, 73(1), 719–748. https://doi.org/10.1146/annurev-psych-020821-114157
Obels, P., Lakens, D., Coles, N. A., Gottfried, J., & Green, S. A. (2020). Analysis of Open Data and Computational Reproducibility in Registered Reports in Psychology. Advances in Methods and Practices in Psychological Science, 3(2), 229–237. https://doi.org/10.1177/2515245920918872

Credit

Title image by rael frames on Unsplash

Olaf (from Frozen): dailymail.co.uk

FAIR-Logo: SangyaPundir on wikimedia commons